This interactive plot compares the predictive performance of two modeling approaches, Linear Regression and Random Forest, in estimating movie box office revenue. The layout consists of three side-by-side plots.
All models use budget, genre, rating, cast size, and the number of production companies and countries as features.
A toggle button above the plots lets you switch between models that include audience sentiment scores as a feature and those that do not.
Contrary to expectations, including audience sentiment did not noticeably improve model accuracy, suggesting that sentiment scores add little predictive value in this context.
Figure: Each panel displays an interactive scatter plot with a dashed 45° reference line indicating perfect prediction. Points closer to the diagonal suggest better model performance. This visual comparison helps assess whether including sentiment information improves prediction quality.
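How closely the points hug the 45° line can also be summarized with a number. The sketch below is illustrative only: the function names and the toy revenue figures are assumptions made for this example, not code or results from the project. It computes RMSE and R² for one set of predictions with sentiment and one without, which is one plausible way to back up the visual comparison.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: typical vertical distance from the 45-degree line."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r_squared(actual, predicted):
    """Coefficient of determination; 1.0 means every point lies on the diagonal."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical revenues in millions of dollars (made up for illustration).
actual = [10.0, 50.0, 120.0, 300.0]
predictions = {
    "without sentiment": [15.0, 45.0, 130.0, 280.0],
    "with sentiment": [14.0, 46.0, 131.0, 281.0],
}

for label, predicted in predictions.items():
    print(f"{label}: RMSE={rmse(actual, predicted):.2f}, "
          f"R2={r_squared(actual, predicted):.4f}")
```

If the two rows of output are nearly identical, the numbers tell the same story as the plots: the sentiment feature changes the predictions very little.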
This interactive dual-panel visualization displays the top 10 words used in movie reviews across different genres. Use the dropdown menu to explore specific genres.
The left panel shows the most frequently used words in reviews for the selected genre.
The right panel highlights the words with the highest TF-IDF (Term Frequency–Inverse Document Frequency) scores, that is, terms that are especially distinctive of the genre.
Hover over each bar to view the exact word frequency or TF-IDF score. This comparison helps reveal not only what audiences talk about most often, but also what language is most distinctive to each genre.
Figure: Each panel shows a horizontal bar chart of the top 10 words in movie reviews for the selected genre. The left chart ranks words by raw frequency, while the right chart ranks them by TF-IDF score, which highlights terms that are distinctive to that genre.
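To make the difference between the two rankings concrete, here is a minimal sketch of how raw frequency and TF-IDF could be computed. It rests on assumptions the page does not confirm: all reviews of a genre are pooled into a single "document", tokenization is a plain whitespace split, and the TF-IDF variant is relative term frequency times `log(N / df)`. The function name `top_words` and the sample reviews are hypothetical.

```python
import math
from collections import Counter

def top_words(reviews_by_genre, genre, k=10):
    """Return (top-k words by raw count, top-k words by TF-IDF) for one genre.

    Assumption: each genre's reviews are pooled into one document.
    """
    counts = {
        g: Counter(word for review in reviews for word in review.lower().split())
        for g, reviews in reviews_by_genre.items()
    }
    n_docs = len(counts)
    # Document frequency: the number of genres in which each word appears.
    doc_freq = Counter(word for c in counts.values() for word in c)

    target = counts[genre]
    total = sum(target.values())
    tfidf = {
        word: (n / total) * math.log(n_docs / doc_freq[word])
        for word, n in target.items()
    }
    by_freq = target.most_common(k)
    by_tfidf = sorted(tfidf.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return by_freq, by_tfidf

# Made-up reviews, for illustration only.
sample = {
    "Horror": ["scary movie scary night", "great movie"],
    "Comedy": ["funny movie", "great funny jokes"],
}
by_freq, by_tfidf = top_words(sample, "Horror", k=3)
print("By frequency:", by_freq)
print("By TF-IDF:  ", by_tfidf)
```

In this toy data, "movie" is frequent in Horror reviews but also appears in Comedy reviews, so its TF-IDF score is zero, while "scary" appears only in Horror and ranks first. That is the pattern the two panels are designed to expose.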
This interactive visualization shows how movie genres differ in the emotions they evoke from audiences.
Genres like Family, Comedy, and Animation top the list with the highest average sentiment scores, reflecting more positive viewer experiences.
In contrast, genres such as Crime and War tend to score lower.
Hover over each bar to see the exact average sentiment score by genre.
Figure: Interactive bar chart of average audience sentiment across movie genres. Bars are sorted by average sentiment score, providing a quick visual comparison of how positively different genres are perceived.
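The ranking behind a chart like this amounts to a group-by-and-average followed by a sort. The sketch below shows one way to do it in plain Python. The function name `average_sentiment_by_genre`, the record layout, and the scores are all hypothetical. The page does not state the sentiment scale or how scores were computed.

```python
from collections import defaultdict

def average_sentiment_by_genre(records):
    """Average (genre, score) pairs per genre and sort from highest to lowest."""
    totals = defaultdict(lambda: [0.0, 0])  # genre -> [sum of scores, count]
    for genre, score in records:
        totals[genre][0] += score
        totals[genre][1] += 1
    means = {genre: s / n for genre, (s, n) in totals.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)

# Made-up review-level scores, for illustration only.
records = [
    ("Family", 0.8), ("Family", 0.6),
    ("Crime", 0.1), ("Crime", 0.3),
    ("Comedy", 0.5),
]
ranking = average_sentiment_by_genre(records)
for genre, mean_score in ranking:
    print(f"{genre}: {mean_score:.2f}")
```

Sorting by the mean is what puts the most positively reviewed genres at one end of the chart and the least positively reviewed at the other.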
Audience reviews reveal rich genre-specific language, from emotional tones to iconic keywords, showing how viewers engage with films through both sentiment and story.
Family, Comedy, and Animation genres spark the most positive emotions, while darker genres like Crime and Horror score lower. Even so, sentiment scores do not noticeably improve prediction accuracy.
The full report (PDF) reveals deeper insights, including the dominant role of budget, nonlinear effects, and how genre interacts with production scale to influence box office revenue.